Segment connection networks for corpus-based speech synthesis

نویسنده

  • Geert Coorman
چکیده

In this paper an efficient segment selection method for corpusbased speech synthesis is presented. Traditional unit-selectors use dynamic programming (DP) to find in a fully connected segment network the most appropriate segment sequence based on targetand concatenation costs. Instead of performing a full DP search, the presented unit-selector applies fast transformations on binary valued segment connection matrices. It is further also shown that this technique can be expanded to supra-segmental unit-selection without altering the segment size in the database.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Simple designing methods of corpus-based visual speech synthesis

This paper describes simple designing methods of corpus-based visual speech synthesis. Our approach needs only a synchronous real image and speech database. Visual speech is synthesized by concatenating real image segments and speech segments selected from the database. In order to automatically perform all processes, e.g. feature extraction, segment selection and segment concatenation, we simp...

متن کامل

High-Quality and Flexible Speech Synthesis with Segment Selection and Voice Conversion

Text-to-Speech (TTS) is a useful technology that converts any text into a speech signal. It can be utilized for various purposes, e.g. car navigation, announcements in railway stations, response services in telecommunications, and e-mail reading. Corpus-based TTS makes it possible to dramatically improve the naturalness of synthetic speech compared with the early TTS. However, no general-purpos...

متن کامل

An Investigation of DNN-Based Speech Synthesis Using Speaker Codes

Recent studies have shown that DNN-based speech synthesis can produce more natural synthesized speech than the conventional HMM-based speech synthesis. However, an open problem remains as to whether the synthesized speech quality can be improved by utilizing a multi-speaker speech corpus. To address this problem, this paper proposes DNN-based speech synthesis using speaker codes as a simple met...

متن کامل

Constructing a segment database for greek time domain speech synthesis

In this article, a methodology is presented regarding the design of a segment database for use with a time-domain speech synthesis system for the Greek language. The main issue of this process is the systematic generation of a corpus containing all possible instances of the segments for the specific language. Particular issues such as the phonetic coverage, the sentence selection as well as ite...

متن کامل

Conditional random fields for hierarchical segment selection in text-to-speech synthesis

In this paper we present the statistically motivated conditional random fields (CRF) approach to concatenative TTS. We use contextual CRFs for speech segment selection where we concatenate the selected segments to an acoustic speech waveform. The CRF approach is used in our corpus-based TTS system AVISS. The acoustic synthesis module consists of trained context dependent CRF models on a multi-l...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006